Scalability! But at what COST?
Authors
Abstract
We offer a new metric for big data platforms, COST, or the Configuration that Outperforms a Single Thread. The COST of a given platform for a given problem is the hardware configuration required before the platform outperforms a competent single-threaded implementation. COST weighs a system’s scalability against the overheads introduced by the system, and indicates the actual performance gains of the system, without rewarding systems that bring substantial but parallelizable overheads. We survey measurements of data-parallel systems recently reported in SOSP and OSDI, and find that many systems have either a surprisingly large COST, often hundreds of cores, or simply underperform one thread for all of their reported configurations.
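The definition above can be made concrete with a minimal sketch. It assumes the single-threaded runtime and the platform's runtime at each measured core count are already known. The function name `cost` and all timings below are hypothetical illustrations, not measurements from the paper.

```python
from typing import Dict, Optional


def cost(single_thread_seconds: float,
         measurements: Dict[int, float]) -> Optional[int]:
    """Return the smallest measured core count at which the platform runs
    faster than the single-threaded baseline, or None if no measured
    configuration does (the COST is then unbounded over the reported range)."""
    for cores in sorted(measurements):
        if measurements[cores] < single_thread_seconds:
            return cores
    return None


# Hypothetical timings in seconds, keyed by core count (illustration only).
baseline = 300.0
system_a = {1: 2400.0, 16: 400.0, 64: 250.0, 128: 140.0}
system_b = {16: 900.0, 64: 500.0, 128: 350.0}

print(cost(baseline, system_a))  # 64
print(cost(baseline, system_b))  # None
```

In this sketch, system A scales well (about 17x from 1 to 128 cores) yet still needs 64 cores to beat one thread, while system B never does within its measured configurations. That gap between apparent scalability and actual gain is what the metric is meant to expose.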
Similar resources
Ownership in Name, But not Necessarily in Action; Comment on “It’s About the Idea Hitting the Bull’s Eye: How Aid Effectiveness Can Catalyse the Scale-up of Health Innovations”
A recently-published paper by Wickremasinghe et al assesses the scalability of pilot projects in three countries using the aid effectiveness agenda as an analytical framework. The authors report uneven progress and recommend applying aid effectiveness principles to improve the scalability of projects. This commentary focuses on one key principle of aid effectiveness – country ownership; it desc...
Dynamic configuration and collaborative scheduling in supply chains based on scalable multi-agent architecture
Due to diversified and frequently changing demands from customers, technological advances and global competition, manufacturers rely on collaboration with their business partners to share costs, risks and expertise. How to take advantage of advancement of technologies to effectively support operations and create competitive advantage is critical for manufacturers to survive. To respond to these...
Multi-objective and Scalable Heuristic Algorithm for Workflow Task Scheduling in Utility Grids
To use services transparently in a distributed environment, the Utility Grids develop a cyber-infrastructure. The parameters of the Quality of Service such as the allocation-cost and makespan have to be dealt with in order to schedule workflow application tasks in the Utility Grids. Optimization of both target parameters above is a challenge in a distributed environment and may conflict one an...
RAIDb: Redundant Array of Inexpensive Databases
Clusters of workstations become more and more popular to power data server applications such as large scale Web sites or e-Commerce applications. There has been much research on scaling the front tiers (web servers and application servers) using clusters, but databases usually remain on large dedicated SMP machines. In this paper, we address database performance scalability and high availabilit...
Querying the Internet with PIER
Achieving scalability is currently one of the goals of the database research community. The Internet comprises far more nodes than any database deployment, yet the largest database systems in the world scale up to at most a few hundred nodes. Supporting large databases is still a challenge because of the limited degree of distribution. The main goal is for databases to scale over the Internet, thus mak...
Publication date: 2015